I. Biederman, P.I.

Final Progress Report for AFOSR Grant No. AFOSR-88-0221
Psychophysical  Analyses  of  Perceptual  Representations 

Abstract 

This report summarizes the major research accomplishments performed under AFOSR
Grant 88-0221, HUMAN IMAGE UNDERSTANDING. An extensive series of
experiments assessing the visual priming of briefly presented images indicates that the
visual representation that mediates real-time object recognition specifies neither the image
edges or vertices nor an overall model of the object but an arrangement of simple volumes
(or geons) corresponding to the object's parts. This representation can be activated with
no loss in efficiency when the image is projected onto the retina at another position, size,
or orientation in depth from when originally viewed. Consideration of these invariances
suggests a computational basis for the evolution of two extrastriate visual systems, one
for recognition and the other subserving motor interaction. The experiments suggest that
it may be possible to assess the functioning of these systems behaviorally, that is, to split
the cortex horizontally, through a comparison of performance on naming and episodic
memory tasks. We have developed a neural network model (Hummel & Biederman,
1992) that captures the essential characteristics of human object recognition performance.

The model takes a line drawing of an object as input and generates a structural description
which is then used for object classification. The model's capacity for structural
description derives from its solution to the dynamic binding problem of neural networks:
Independent units representing an object's parts (in terms of their shape attributes and
interrelations) are bound temporarily when those attributes occur in conjunction in the
system's input. Temporary conjunctions of attributes are represented by synchronized
activity  among  the  units  representing  those  attributes.  Specifically,  the  model  induces 
temporal  correlation  in  the  firing  of  activated  units  to:  a)  parse  images  into  their 
constituent  parts;  b)  bind  together  the  attributes  of  a  part;  and  c)  determine  the  relations 
among  the  parts  and  bind  them  to  the  parts  to  which  they  apply.  Because  it  conjoins 
independent  units  temporarily,  dynamic  binding  allows  tremendous  economy  of 
representation,  and  permits  the  representation  to  reflect  an  object's  attribute  structure. 

The  model's  recognition  performance  conforms  well  to  recent  results  from  shape  priming 
experiments.  Moreover,  the  manner  in  which  the  model’s  performance  degrades  due  to 
accidental synchrony produced by an excess of phase sets suggests a basis for a theory of
visual  attention. 
This document summarizes the major research contributions to derive from Air Force
Office of Scientific Research Grant 88-0221, Human Image Understanding, to Irving Biederman.
The initial section presents an overall summary, followed by more detailed descriptions, largely
taken  from  the  abstracts  of  the  published  reports  from  this  work.  These  published  reports  have 
been  submitted  to  the  monitor  under  separate  cover.  The  final  section  lists  the  publications, 
presentations,  and  recognition  that  the  research  supported  by  the  grant  has  received. 

SUMMARY  OF  RESEARCH  CONTRIBUTIONS 

Consider  figure  1.  We  can  appreciate  that  the  three  images  represent  the  same  (unfamiliar) 
object,  despite  substantial  differences  in  size,  position,  and  orientation  in  depth.  We  will  refer  to 
these variations as variations in viewpoint.


Fig. 1. The above shape is readily detectable as constant across the three views despite its being
unfamiliar.

The subjective equivalence of the three images in figure 1 is not illusory. Recent object
priming studies on this project have established that, indeed, the speed of object recognition, as
assessed by visual priming of naming latencies, is invariant with translation, scale, and orientation
in depth (up to parts occlusion) (Biederman & Cooper, in press, a, b [Appendices A & B];
Biederman, 1991; Gerhardstein & Biederman, 1991). A weak form of invariance would
imply only that human observers could appreciate that the three objects depicted in figure 1 are
the same shape. Casual viewing, of the kind invited in the first paragraph of this section, is
sufficient to document that such invariance can be achieved.

There is a strong form of invariance, however, concerning the time required to achieve the
equivalence, that, surprisingly, has only rarely been tested in visual shape recognition. That is, the
considerable  facilitation  in  the  naming  RTs  (reaction  times)  and  error  rates  in  the  naming  of  brief, 
masked  pictures  of  objects  on  a  second  block  of  trials,  presented  several  minutes  after  they  were 
named  on  a  first  block,  is  unaffected  by  a  change  in  the  position  of  the  object  relative  to  fixation 
(either  left-right  or  up-down),  its  size,  or  its  orientation  in  depth.  That  a  considerable  portion  of 
the  priming  is  visual  (and  not  just  a  function  of  activation  of  the  name  or  entry-level  concept)  is 
evidenced  by  a  large  reduction  in  priming  for  a  same  name,  different  shaped  exemplar,  as  when  a 
grand  piano  is  shown  on  the  second  block  when  initially  an  upright  piano  was  viewed. 
These invariances are so fundamental to object recognition that theory in this domain
consists  largely  of  explaining  how  they  could  come  about.  On  computational  grounds,  the 
invariances  seem  entirely  reasonable  in  that  the  alternative,  a  separate  representation  of  an  object  for 
each  of  its  image  manifestations,  would  require  a  prohibitively  large  number  of  representations. 
The  invariance  in  recognition  speed,  moreover,  is  inconsistent  with  the  hypothesis  (such  as  that 
advanced  by  Ullman,  1989)  that  recognition  is  achieved  through  template  transformations  for 
translating, scaling, or rotating an image or template so as to place the two in correspondence, as
such transformations would (presumably) require time for their execution, not to mention the
formidable  initial  problem  of  selecting  the  appropriate  transformation  to  apply  to  an  unknown 
image. 


Now consider Figure 2, in which we are to judge which one of the three stimuli is not like
the  others.  We  readily  select  3  as  different,  yet  there  is  a  much  greater  difference  in  contour  (as 
assessed  by  the  number  of  mismatching  pixels  in  the  best  match)  between  the  image  of  object  2  and 
the  other  two  images  in  that  2's  brick  is  more  elongated  than  the  bricks  of  the  other  two  objects 
(whose  bricks  are  identical).  Objects  2  and  3  have  identical  cones,  so  the  difference  is  in  the  aspect 
ratio of their bricks. This demonstration suggests that relatively small differences in contour that
produce a qualitative difference (whether the tip of the cone is pointed or rounded, in the example)
can have a more noticeable effect on classification than larger differences in a metric property, such
as aspect ratio, which varies with viewpoint. Our interpretation of qualitative is "viewpoint
invariant" and the empirical work described in this report is, to a large extent, concerned with
exploring how viewpoint invariance in visual shape recognition performance can be achieved.¹


Fig. 2. Object A is judged to be more similar to the standard than B, but B is a closer match for a
template  model. 
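The contrast between a template (pixel-overlap) measure and a qualitative difference can be illustrated with a toy computation. The sketch below is not taken from the experiments: the pixel shapes, the names `rect`, `shift`, and `best_match_mismatch`, and the shift range are all assumptions made for illustration. It counts mismatching pixels at the best translational alignment, in the spirit of the "number of mismatching pixels in the best match" mentioned above.

```python
def rect(top, left, height, width):
    """Return the set of (row, col) pixels covered by a filled rectangle."""
    return {(r, c)
            for r in range(top, top + height)
            for c in range(left, left + width)}


def shift(pixels, d_row, d_col):
    """Translate a pixel set by (d_row, d_col)."""
    return {(r + d_row, c + d_col) for r, c in pixels}


def best_match_mismatch(image, template, max_shift=3):
    """Smallest number of mismatching pixels over all translations tried."""
    return min(
        len(image ^ shift(template, d_row, d_col))
        for d_row in range(-max_shift, max_shift + 1)
        for d_col in range(-max_shift, max_shift + 1)
    )


# Hypothetical stand-ins for the cone tips: pointed vs. rounded.
POINTED_TIP = {(3, 2), (3, 3), (3, 4), (2, 3)}
ROUNDED_TIP = {(3, 2), (3, 3), (3, 4), (2, 2), (2, 3), (2, 4)}

standard = rect(4, 0, 3, 6) | POINTED_TIP             # brick plus pointed cone
metric_variant = rect(4, 0, 3, 10) | POINTED_TIP      # elongated brick only
qualitative_variant = rect(4, 0, 3, 6) | ROUNDED_TIP  # rounded tip only

metric_mismatch = best_match_mismatch(metric_variant, standard)
qualitative_mismatch = best_match_mismatch(qualitative_variant, standard)
print(metric_mismatch, qualitative_mismatch)
```

In this toy case the elongated brick produces the larger pixel mismatch, so a pure template measure would rank the qualitatively different shape as the closer match, which is the opposite of the judgment described in the text.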

We have developed a theory to account for this capacity, Recognition-by-Components
(RBC), which posits that, for purposes of entry level recognition, objects are represented as an
arrangement of convex volumetric primitives such as bricks, wedges, cylinders, cones, lemons,


¹"Viewpoint invariance" can refer to: a) stability of certain kinds of contour information with changes of viewpoint,
and b) the lack of an effect on performance from changes in viewpoint. The context should disambiguate which sense
is intended.
and their singly concave curved-axes counterparts, such as a cylinder with a curved axis
(Biederman, 1987), as illustrated in Figure 3. There are 24 primitives, called geons, in the current
version of the theory and they have the property that they can be distinguished from a general
viewpoint.  For  example,  from  almost  any  viewpoint  one  would  be  able  to  distinguish  a  brick  from 
a  cylinder.  Once  two  or  three  geons  and  their  relations  are  specified,  then  almost  any  image  of  an 
object  can  be  recognized  as  an  instance  of  its  entry  level  class. 


Fig. 3. A given view of an object can be represented as an arrangement of simple primitive
volumes, or geons, of which five are shown here. Only two or three of the geons are needed to
specify an object.

Parts-based recognition

What evidence is there that object recognition is (simple) parts-based? Consideration
of this question requires explication of alternatives to parts-based representation. Two have been
proposed: templates and lower level features. Some of the evidence against templates is the
robustness of recognition when an object is presented at a novel orientation in depth, or with some
of its parts removed, or with the addition of irrelevant parts.

Biederman  and  Cooper  (1991)  recently  reported  a  more  direct  test  of  the  alternatives.  They 
used  picture  priming  tasks  to  assess  whether  the  facilitation  of  naming  RTs  and  accuracy  on  a 
second  block  of  object  pictures  is  a  function  of  the  repetition  of  the  object's:  a)  image  features 
(viz.,  vertices  and  edges),  b)  the  object  model  (e.g.,  that  it  is  a  grand  piano),  or  c)  a  representation 
intermediate  between  a)  and  b)  consisting  of  convex  or  singly  concave  components  of  the  object, 
roughly corresponding to the object's parts. Subjects viewed pictures with half their contour
removed by deleting either a) every other image feature from each part (as shown in Fig. 4), or b)
half the components (as shown in Fig. 5). On a second (primed) block of trials, subjects saw: a)
the  identical  image  that  they  viewed  on  the  first  block,  b)  the  complement  which  had  the  missing 
contours,  or  c)  a  same  name,  different  exemplar  of  the  object  class  (e.g.,  a  grand  piano  when  an 
upright  piano  had  been  shown  on  the  first  block).  With  deletion  of  features,  speed  and  accuracy  of 
naming  identical  and  complementary  images  were  equivalent,  indicating  that  none  of  the  priming 
could  be  attributed  to  the  features  actually  present  in  the  image.  Performance  with  both  types  of 
image  enjoyed  an  advantage  over  the  different  exemplars,  establishing  that  the  priming  was  visual, 
rather  than  verbal  or  conceptual.  With  deletion  of  the  components,  performance  with  identical 
images  was  much  better  than  their  complements.  The  latter  were  equivalent  to  the  different 
exemplars,  indicating  that  all  the  visual  priming  of  an  image  of  an  object  can  be  modeled  in  terms 
of a representation of its components (in specified relations). Alternative explanations are still
somewhat viable and a portion of the proposed effort is directed toward their assessment.


[Figures 4 and 5: two panels of example images, each with three columns labeled Complementary
Image 1, Complementary Image 2, and Same Name, Different Exemplar.]


Figures 4 (left) showing feature deletion and 5 (right) showing parts deletion. First two columns
in each panel: Complementary pairs of images created by deleting every other edge and vertex
from each geon (Fig. 4) or half the parts (Fig. 5). Each member of a complementary pair had half
the  contour  so  that  if  the  members  of  a  pair  were  superimposed,  the  composite  would  make  for  an 
intact  picture  without  any  overlap  in  contour.  Assuming  that  the  image  in  the  left  column  was 
shown  on  the  first  block,  the  same  image  on  the  second  block  would  be  an  instance  of  identical 
priming,  the  middle  image  would  be  complementary  priming,  and  the  right  would  be  a  different 
exemplar (same name) control. For images of the type shown in Fig. 4, identical and
complementary conditions produced equivalent priming, both of which were greater (in priming)
than the different exemplar condition. For images of the type in Fig. 5, more priming was
associated  with  the  identical  images  than  either  the  complementary  or  different  exemplar  images, 
which  did  not  differ  from  each  other. 

Neural  Net  Implementation  of  RBC 


These  presumed  characteristics  of  human  shape  recognition—invariant,  parts-based 
representations—have  provided  the  goals  for  a  neural  net  implementation  of  RBC  that  takes  as  its 
input  a  line  drawing  of  an  object's  orientation  and  depth  discontinuities  and  as  output  activates  a 
unit  representing  a  structural  description  that  is  invariant  with  position,  size,  and  orientation  in 
depth (Hummel & Biederman, in press [Appendix C]). Figure 6 shows the overall architecture of
the model (called JIM).

JIM's capacity for structural description derives from its solution to the dynamic binding
problem of neural networks: Independent units representing an object's parts (in terms of their
shape attributes and interrelations) are bound temporarily when those attributes occur in
conjunction in the system's input. Temporary conjunctions of attributes are represented by
synchronized  (or  phase  locked)  oscillatory  activity  among  the  units  representing  those  attributes. 
Specifically,  the  model  uses  phase  locking  to:  a)  parse  images  into  their  constituent  parts;  b)  bind 
together  the  attributes  of  a  part;  and  c)  determine  the  relations  among  the  parts  and  bind  them  to  the 
parts  to  which  they  apply.  Because  it  conjoins  independent  units  temporarily,  dynamic  binding 
allows  tremendous  economy  of  representation,  and  permits  the  representation  to  reflect  the  attribute 
structure  of  the  shapes  represented. 
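The idea of binding independent attribute units by synchrony can be sketched in a few lines. This is a deliberately simplified illustration, not JIM's mechanism: it assumes binary spike trains and treats two units as bound only when their trains are identical, whereas the model uses phase-locked oscillatory activity. The unit names and the function `bind_by_synchrony` are hypothetical.

```python
from collections import defaultdict


def bind_by_synchrony(spike_trains):
    """Group units whose spike trains are identical (perfect synchrony).

    spike_trains maps a unit name to a sequence of 0/1 values, one per
    time step. Units that fire on exactly the same steps end up in the
    same group, i.e. they are treated as temporarily bound.
    """
    groups = defaultdict(set)
    for unit, spikes in spike_trains.items():
        groups[tuple(spikes)].add(unit)
    return sorted(sorted(members) for members in groups.values())


# Four independent attribute units describe two parts of one object.
# No unit is dedicated to a conjunction such as "horizontal brick";
# the conjunction exists only in the timing.
spike_trains = {
    "brick":      [1, 0, 1, 0],
    "horizontal": [1, 0, 1, 0],
    "cone":       [0, 1, 0, 1],
    "vertical":   [0, 1, 0, 1],
}

bound_parts = bind_by_synchrony(spike_trains)
print(bound_parts)
```

The economy noted in the text shows up in the unit count: two shape units and two orientation units suffice here, where a code with one unit per conjunction would need a unit for every shape-orientation pair.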


Fig. 6. An overview of the net's architecture indicating the representation activated at each layer.
In L (layer) 3 and above, large circles indicate cells activated in response to the image and dots
indicate inactive cells. Cells in L1 represent the edges (specifying discontinuities in surface
orientation and depth). L2 represents the vertices, axes, and blobs defined by conjunctions of
edges in L1. L3 represents the geons in terms of their defining attributes (axis, straight or
curved; cross section, straight or curved; and sides, parallel or not parallel) as well as coarse
coding of metric attributes of the geons. L4 and L5 represent the relative relations among geons.
Cells in L6 respond to conjunctions of cells in L3 and L5, and cells in L7 respond to conjunctions
of L6 cells.
Brief Overview of JIM. As shown in Figure 7, layer 1 (L1) is a highly simplified version
of V1 consisting of a 22 X 22 array of spatially arranged columns (roughly analogous to V1
hypercolumns), each with 48 cells that respond to local lines (differentially to straight vs. curved
and end-stopped vs. segments that extend through the receptive field) at various orientations. A
second layer contains units that respond to vertices at various orientations (activated by the
end-stopped units in L1), axes of surfaces, and the general mass (blob) of a volume. Binding is
initiated at these first two layers by Fast Enabling Links (FELs), connections between pairs of cells
that result in phase locking the outputs of activated cells that are collinear (or cocircular), closely
parallel, or coterminating. For example, the various collinear segment cells activated by a line with
a length greater than the receptive field diameter of those cells will all fire in synchrony.
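One way to picture how pairwise links can synchronize a whole chain of cells is to treat each FEL as an edge in a graph and each phase-locked set as a connected component. The sketch below makes that assumption explicitly (that synchrony propagates transitively along chains of links); the cell numbering and the function `phase_groups` are hypothetical, and the actual model settles into phase locking through its dynamics rather than by graph search.

```python
def phase_groups(n_cells, links):
    """Return sets of cells that end up firing together.

    n_cells: number of active cells, numbered 0 .. n_cells - 1.
    links:   pairs (a, b) of cells joined by an enabling link.
    Cells connected directly or through a chain of links share a group.
    """
    parent = list(range(n_cells))

    def find(i):
        # Follow parent pointers to the group representative,
        # shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for cell in range(n_cells):
        groups.setdefault(find(cell), []).append(cell)
    return sorted(groups.values())


# Cells 0-2 lie along one long line (collinear neighbours are linked),
# cells 3-4 lie along a second line, and cell 5 has no linked neighbour.
groups = phase_groups(6, [(0, 1), (1, 2), (3, 4)])
print(groups)
```

Cells 0 and 2 are not linked directly, yet they fall in the same group through cell 1, which mirrors the example of a line longer than any single receptive field.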


[Figure 7 panel text: The model's first layer is divided into 22 X 22 locations. At each location
there are 48 cells. Cell-type labels: Straight, Curved.]


Figure  7.  Detail  of  the  model's  first  layer.  Image  edges  are  represented  in  terms  of  their  location 
in  the  visual  field,  orientation,  curvature,  and  whether  they  terminate  within  the  cell’s  receptive 
field  or  pass  through  it. 
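For a sense of scale, the first layer as described (22 X 22 locations with 48 cells each) can be laid out as a flat array. The indexing scheme below is only one convenient convention and is an assumption; the report does not specify how cells are ordered, and `l1_index` is a hypothetical helper.

```python
GRID_SIZE = 22            # locations per side, from the text
CELLS_PER_LOCATION = 48   # cells at each location, from the text
TOTAL_L1_CELLS = GRID_SIZE * GRID_SIZE * CELLS_PER_LOCATION


def l1_index(row, col, cell):
    """Map (row, col, cell) to a position in a flat array of L1 cells.

    The row-major ordering used here is an illustrative choice only.
    """
    if not (0 <= row < GRID_SIZE
            and 0 <= col < GRID_SIZE
            and 0 <= cell < CELLS_PER_LOCATION):
        raise ValueError("row, col, or cell is out of range")
    return (row * GRID_SIZE + col) * CELLS_PER_LOCATION + cell


print(TOTAL_L1_CELLS)  # 22 * 22 * 48 = 23232
```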

The  units  in  L2  activate  55  units  in  L3  that  provide  an  invariant  representation  of  the 
object's  geons  and  the  characteristics  of  these  geons  (viz.,  aspect  ratio  and  absolute  orientation 
[vertical,  horizontal,  or  oblique]).  The  phase  locking  in  the  third  layer  is  maintained  so  that  all  the 
cells  that  represent  a  particular  geon,  say  the  brick  in  Figure  7,  will  fire  synchronously  but  out  of 
phase  with  the  firing  of  another  geon  in  that  object,  say  the  cone  in  Figure  7.  Outputs  from  the 
size, location, and orientation units in L3 activate units in L4 and L5 that represent invariant
relations between pairs of geons, such as relative position (above, below, side of), relative size
(larger than, smaller than, equal to), or relative orientation (perpendicular, parallel, or diagonal to).
The phase locking is maintained through these layers as well so that simultaneously arriving L3
and L5 outputs will recruit a given geon feature assembly (GFA) cell in L6. Such cells will
represent a given geon, its attributes, and its pairwise relations to other geons, e.g., that a brick
(actually a part with a straight cross section, straight axis, and constant sized cross section) that is
horizontal, wider than it is tall, below, larger than, and perpendicular to something else is present
in the object. Closely firing L6 cells will recruit a given L7 cell which will represent a given
object.
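The pairwise relations named above can be made concrete with a small sketch. Everything in it is an assumption for illustration: the `Geon` fields, the scalar encodings of position and size, and the rule that equal orientation labels count as "parallel" are simplifications and are not how JIM's relation units are computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Geon:
    name: str
    height: float      # vertical position of the centre (hypothetical units)
    size: float        # overall size (hypothetical units)
    orientation: str   # "vertical", "horizontal", or "oblique"


def relations(a, b):
    """Describe geon a relative to geon b as (position, size, orientation)."""
    if a.height > b.height:
        position = "above"
    elif a.height < b.height:
        position = "below"
    else:
        position = "side-of"

    if a.size > b.size:
        size = "larger-than"
    elif a.size < b.size:
        size = "smaller-than"
    else:
        size = "equal-to"

    if a.orientation == b.orientation:
        orientation = "parallel"        # simplification: same label = parallel
    elif {a.orientation, b.orientation} == {"vertical", "horizontal"}:
        orientation = "perpendicular"
    else:
        orientation = "diagonal"
    return (position, size, orientation)


brick = Geon("brick", height=0.0, size=4.0, orientation="horizontal")
cone = Geon("cone", height=1.0, size=2.0, orientation="vertical")

print(relations(brick, cone))
print(relations(cone, brick))
```

The first result matches the example in the text: the brick is below, larger than, and perpendicular to the other part. Because the relations are stated between parts rather than in image coordinates, they do not change when the whole object is translated or rescaled.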


Locus  of  priming.  In  the  context  of  the  model,  where  is  the  locus  of  visual  object  priming? 
The  absence  of  object  model  priming  (as  evidenced  by  the  absence  of  priming  between 
complements with different parts) suggests that it cannot be attributed to residual activation of L7
cells. The failure to find any contributions from reinstatement of edges and vertices argues against
the substrate existing at L1 or the vertex units in L2. Moreover, because the same units in L1 to L5
are used repeatedly for different objects, activation of any one unit of a particular object would be
readily overwritten by the activation of that and other units in that layer by other objects. In
general, the first five layers would presumably be set beyond (if not before) infancy, so it would
be unlikely for priming to have a noticeable effect at these early stages. We (Cooper & Biederman,
1991)  tested  this  proposition  directly  by  presenting,  immediately  prior  to  the  presentation  of  an 
object,  the  single  largest  geon  from  that  object.  Such  a  prime  would  contain  the  specifications  of  a 
geon of the object, its absolute orientation and aspect ratio but none of the relations, such as TOP-OF,
LARGER-THAN, and PERPENDICULAR-TO. Compared to control trials in which the
prime  was  not  contained  in  the  object  or  no  prime  was  presented,  no  priming  was  observed.  In 
fact,  before  this  experiment  was  done,  it  was  apparent  that  this  result  was  predicted  from  JIM.  The 
reason  for  this  is  that  a  geon  feature  assembly  cell  in  L6  has  to  have  a  high  "vigilance  parameter" 
(or  sharp  tuning  function)  if  it  is  to  distinguish  among  objects  that  contain  the  same  geons.  In 
particular,  without  the  inputs  from  the  relation  units,  L6  units  would  be  activated  by  similar  geons 
from  competing  objects.  An  analogy  can  be  made  through  a  gedanken  experiment  in  which  one 
might attempt to prime five letter words with a single letter. No priming would be expected; not
because letters are irrelevant to words, but because distinctiveness requires specification of both a
particular letter (or spelling pattern, which consists of a particular group of letters) and its
position in a letter sequence.
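The argument about the vigilance parameter can be illustrated with a toy threshold unit. This is a sketch under stated assumptions, not the model's activation rule: the feature labels, the proportion-matched measure, and the vigilance value of 0.9 are all hypothetical choices made for the example.

```python
def gfa_active(expected, present, vigilance=0.9):
    """Return True if enough of a unit's expected inputs are present.

    expected:  set of inputs the geon feature assembly unit is tuned to.
    present:   set of inputs currently active.
    vigilance: minimum proportion of expected inputs that must be matched
               (a hypothetical stand-in for a sharp tuning function).
    """
    matched = len(expected & present) / len(expected)
    return matched >= vigilance


# A unit tuned to a geon, its attributes, and its relations.
expected = {
    "brick", "horizontal", "wider-than-tall",       # geon and attributes
    "below", "larger-than", "perpendicular-to",     # relations
}

# A single-geon prime supplies the geon and its attributes but no relations.
single_geon_prime = {"brick", "horizontal", "wider-than-tall"}

prime_activates = gfa_active(expected, single_geon_prime)
object_activates = gfa_active(expected, expected)
print(prime_activates, object_activates)
```

With a high vigilance setting the isolated geon matches only half of the unit's inputs and fails to activate it, which is consistent with the absence of priming from single-geon primes reported above.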

Priming would thus be localized at three possible sites: a) the weight matrix for L3 & L5 -> L6,
which would be the earliest locus where priming should be manifested, b) activation of L6 units,
and/or c) the L6 -> L7 weight matrix.

SUMMARY  OF  INDIVIDUAL  PROJECTS 

The  summary  of  the  individual  research  projects  is  divided  into  three  major  sections.  The 
research  described  in  Section  I  employed  priming  to  study  the  form  of  the  representation  (Part  A) 
and  the  invariances  (Part  B).  Several  methodological  studies  have  also  been  performed  on  the 
technique itself. Section II describes a major effort centered on the development of a neural net
implementation of RBC. Section III describes research designed to explore the cortical
implementation  of  object  recognition,  including  some  new  work  on  patient  populations.  Major 
references  are  provided  after  each  abstract. 

I. Studies of Priming:

A.  Nature  of  the  Representation 
1. Priming Contour-Deleted Images: Evidence for Intermediate Representations
Mediating  Visual  Object  Recognition  Rather  than  Specific  Contours  (edges  and 
vertices)  or  Subordinate  Models. 

The speed and accuracy of perceptual recognition of a briefly presented picture of an object
are facilitated by its prior presentation. Picture priming tasks were used to assess whether the
facilitation is a function of the repetition of: (a) the object's image features (viz., vertices and
edges), (b) the object model (e.g., that it is a grand piano), or (c) a representation intermediate
between (a) and (b) consisting of convex or singly concave components of the object, roughly
corresponding to the object's parts. Subjects viewed pictures with half their contour removed by
deleting either (a) every other image feature from each part, or (b) half the components. On a
second (primed) block of trials, subjects saw: (a) the identical image that they viewed on the first
block, (b) the complement which had the missing contours, or (c) a same name-different exemplar
of the object class (e.g., a grand piano when an upright piano had been shown on the first block).
With deletion of features, speed and accuracy of naming identical and complementary images were
equivalent, indicating that none of the priming could be attributed to the features actually present in
the image. Performance with both types of image enjoyed an advantage over that with the different
exemplars, establishing that the priming was visual, rather than verbal or conceptual. With
deletion of the components, performance with identical images was much better than that with their
complements. The latter were equivalent to the different exemplars, indicating that all the visual
priming of an image of an object is through the activation of a representation of its components in
specified relations. In terms of a recent neural net implementation of object recognition (Hummel
& Biederman, in press), the results suggest that the locus of object priming may be at changes in
the weight matrix for a geon assembly layer, where units have self-organized to represent
combinations of convex or singly concave components (or geons) and their attributes (e.g., aspect
ratio, orientation and relations with other geons such as TOP-OF). The results of these
experiments provide evidence for the psychological reality of intermediate representations in
real-time visual object recognition.

Reference 

Biederman, I., & Cooper, E. E. (1991). Priming contour-deleted images: Evidence for
intermediate representations in visual object recognition. Cognitive Psychology, 23, 393-419.

2.  Pattern  Goodness  Can  be  Understood  as  the  Working  of  the  Same  Processes 
that  Produce  Invariant  Parts  for  Purposes  of  Object  Recognition. 

Pattern goodness, or Prägnanz, has been a subject of study and theorizing for over half a century
but  its  role  in  vision  remains  uncertain.  The  traditional  theoretical  dispute  as  to  whether  goodness 
reflects  a  tendency  for  perception  to  derive  the  simplest  interpretation  of  the  stimulus  versus  the 
most  frequently  occurring  pattern  in  the  environment  is  probably  unresolvable  in  the  absence  of  a 
theory  that  defined  what  a  stimulus  was  (particularly  one  projected  from  a  three  dimensional 
object),  so  that  its  likelihood  could  be  determined,  and  the  manner  in  which  constraints  toward 
simplicity  could  or  could  not  be  regarded  as  something  extractable  from  the  regularities  of  images. 
We  argue  that  it  is  likely  that  goodness  effects  are  epiphenomenal,  reflecting  the  operation  of 
perceptual  mechanisms  designed  to  infer  a  three  dimensional  world  from  parts  segmented  from  a 
two dimensional image and provide descriptions of objects that can be recognized from a novel
viewpoint or that are partially occluded. These perceptual mechanisms are scale sensitive and
include processes for viewpoint-invariant edge characterization, segmentation, and the activation of
shape representations.


Reference 
Biederman, I., Hilton, H. J., & Hummel, J. E. (1991). Pattern goodness and pattern
recognition, Ch. 5, Pp. 73-95. In J. R. Pomerantz & G. R. Lockhead (Eds.) The Perception
of Structure. Washington, D.C.: APA.

3. Single Volumes are Insufficient Primes: Relations are Needed as Well.

Subjects  are  faster  to  name  an  object  picture  with  a  basic  level  name  which  they  have  previously 
named. Biederman and Cooper (1991) have shown that the perceptual portion of this priming
effect does not depend on the repetition of the image features (edges and vertices) present in the
original  image  or  of  the  overall  object  model,  but  rather  involves  simple  components,  often 
corresponding  to  an  object's  parts,  intermediate  between  these  two  representations.  Two 
experiments  were  conducted  to  determine  the  representational  level  at  which  priming  occurs. 
Subjects  named  objects  that  could  be  preceded  by  a  single  volume  prime  (which  could  either  be 
present or absent in the object) or a neutral line. No effect of prime type was found on object
naming  RTs  or  errors  even  when  the  objects'  identities  were  made  salient  by  displaying  them 
beforehand.  The  results  support  a  representational  level  specifying  an  object's  convex  components 
and  their  relations  as  the  locus  of  priming. 

Reference 

Cooper,  E.  E.,  &  Biederman,  I.  (1991).  Priming  objects  with  single  volumes.  Submitted  for 
publication. 

B.  INVARIANCE 

1.  Size  Invariance  in  Visual  Object  Priming 

Abstract.  The  magnitude  of  priming  resulting  from  the  perception  of  a  briefly  presented  picture  of 
an  object  in  an  earlier  trial  block,  as  assessed  by  naming  reaction  times  (RTs),  was  found  to  be 
independent  of  whether  the  primed  object  was  presented  at  the  same  or  a  different  size  as  when 
originally  viewed.  In  contrast,  RTs  and  error  rates  for  "same"  responses  for  old-new  shape 
judgments  were  very  much  increased  by  a  change  in  object  size  from  initial  presentation.  We 
conjecture that this dissociation between the effects of size consistency on naming and old-new
shape  recognition  may  reflect  the  differential  functioning  of  two  independent  systems  subserving 
object  memory:  one  for  representing  the  shape  of  an  object  and  the  other  for  representing  its  size, 
position,  and  orientation  (metric  attributes).  With  allowance  for  response  selection,  object  naming 
RTs  may  provide  a  relatively  pure  measure  of  the  functioning  of  the  shape  system.  Both  the  shape 
and  metric  systems  may  affect  the  feelings  of  familiarity  that  govern  old-new  episodic  shape 
judgments.  A  comparison  of  speeded  naming  and  episodic  recognition  judgments  may  provide  a 
behavioral,  noninvasive  technique  for  determining  the  neural  loci  of  these  two  systems. 

Reference 

Biederman,  I.,  &  Cooper,  E.  E.  (1992).  Scale  invariance  in  visual  object  priming.  Journal  of 
Experimental Psychology: Human Perception and Performance, In press.

2.  Translational  and  Reflectional  Invariance  in  Visual  Object  Priming 
The magnitude of priming on naming reaction times and on the error rates, resulting from the
perception of a briefly presented picture of an object approximately 7 min before the primed object,
was found to be independent of whether the primed object was originally viewed in the same
hemifield, left-right or upper-lower, or in the same left-right orientation. Performance for same-
name, different-exemplar images was worse than for identical images, indicating that not only was
there priming from block one to block two, but that some of the priming was visual, rather than
purely verbal or conceptual. These results provide evidence for complete translational and
reflectional invariance in the representation of objects for purposes of visual recognition. Explicit
recognition  memory  for  position  and  orientation  was  above  chance,  suggesting  that  the 
representation  of  objects  for  recognition  is  independent  of  the  representations  of  the  location  and 
left-right  orientation  of  objects  in  space. 

Reference 

Biederman,  I.,  &  Cooper,  E.  E.  (1991).  Evidence  for  complete  translational  and  reflectional 
invariance  in  visual  object  priming.  Perception,  In  press. 

3.  3D  Orientation  Invariance 

People  show  little  difficulty  in  recognizing  familiar  objects  from  a  novel  orientation  in  depth,  as 
long  as  the  view  is  nonaccidental.  In  picture  priming  tasks,  naming  reaction  times  (RTs)  are 
unaffected by rotations up to 135° from an originally presented orientation. According to one
theory, Recognition-by-Components (RBC) (Biederman, 1987; Hummel & Biederman, 1992),
this invariance derives from the employment of part descriptors (e.g., geons) that allow members of
different basic level classes to be distinguished from a general viewpoint. However, a number of
recent investigations have documented large increases in RTs for "nonsense" objects viewed at
novel  orientations.  Only  by  experiencing  a  particular  object  at  a  particular  orientation  was  it 
possible  to  eliminate  the  effect  of  rotation  to  that  orientation  for  that  object.  These  results  have 
been  interpreted  as  suggesting  that  the  depth  invariance  with  familiar  objects  was  a  result  of  their 
familiarity  at  a  variety  of  orientations.  However,  in  all  the  experiments  with  nonsense  objects,  the 
stimuli  were  not  distinguishable  by  geon  type  or  first  order  relations  (such  as  TOP-OF  or  SIDE- 
CONNECTED).  For  example,  every  member  of  the  set  of  nonsense  objects  might  be  composed  of 
bricks,  of  varying  lengths,  at  right  angles  to  each  other.  One  such  object  might  be  describable  as 
having the third shortest brick connected end-to-end with the longest brick and the second
longest  brick  connected  end-to-middle  to  the  shortest  brick.  Such  distinctions  characterize  a 
particular  subset  of  difficult  and  rare  subordinate  classifications,  more  akin  to  discriminating 
among  kinds  of  sparrows  or  different  kinds  of  tanks,  rather  than  the  basic  level  classification, 
distinguishing  sparrows  from  tanks,  for  which  RBC  was  designed.  We  tested  whether  the  depth 
invariance  for  common  objects  was  a  function  of  their  familiarity  by  constructing  sets  of  nonsense 
objects  that  were  readily  distinguishable  by  their  geons.  With  these  stimuli,  the  effects  of  depth 
rotation  previously  reported  for  nonsense  objects  were  greatly  reduced. 

Reference 

Gerhardstein,  P.  C.,  &  Biederman,  I.  3D  orientation  invariance  in  visual  object  recognition. 
Paper  presented  at  the  Meetings  of  ARVO,  Sarasota,  FL.,  May  1991. 

4. Shape Invariance: A Review and Further Evidence

Abstract.  Phenomenologically,  human  shape  recognition  appears  to  be  invariant  with  changes  of 
orientation  in  depth  (up  to  parts  occlusion),  position  in  the  visual  field,  and  size.  It  is  possible  that 
these  invariances  are  achieved  through  the  application  of  transformations  such  as  rotation, 
translation,  and  scaling  of  the  image  so  that  it  can  be  matched  metrically  to  a  stored  template. 
Presumably, such transformations would require time for their execution. We describe recent
priming experiments in which the effects of a prior brief presentation of an image on its subsequent
recognition are assessed. The results of these experiments indicate that the invariance is complete:
the magnitude of visual priming (as distinct from name or basic level concept priming) is not
affected by a change in position, size, orientation in depth, or lines and vertices, as long as
representations of the same components can be activated. An implemented seven layer neural
network model (Hummel & Biederman, in press) that captures these fundamental properties of
human object recognition is described. Given a line drawing of an object, the model activates a
viewpoint-invariant structural description of the object specifying its parts and their interrelations.
Visual priming is interpreted as a change in either a) the connection weights for the activation of
cells representing geon feature assemblies (GFAs), that is, cells that conjoin the output of units that
represent invariant, independent properties of a single geon and its relations (such as its type,
aspect ratio, relations to other geons), or b) the connection weights by which several GFAs activate
a cell representing an object.

Reference 

Cooper, E. E., Biederman, I., & Hummel, J. E. (1992). Metric invariance in object recognition:
A  review  and  additional  evidence.  Canadian  Journal  of  Psychology,  In  press. 

C.  PRIMING  METHODOLOGY 

1. Name and Concept Priming

Researchers  using  object  naming  latency  to  study  perceptual  processes  in  object  recognition  may 
find  their  effects  obscured  by  variance  attributable  to  lexical  properties  of  the  object  names.  An 
experiment  was  conducted  to  determine  if  reading  the  object  names  prior  to  picture  recognition 
could  reduce  this  variance  without  interacting  with  subsequent  perceptual  processes.  Subjects 
were  divided  into  two  groups,  one  of  which  read  the  names  of  the  objects  prior  to  identification 
and  one  which  did  not.  Subjects  in  both  groups  were  required  to  name  object  pictures  as  rapidly  as 
possible. In a first block of trials, subjects identified 16 objects. In a second block, subjects
identified  32  objects,  half  of  which  were  different  shaped  examples  of  objects  viewed  on  the  first 
block  and  half  of  which  were  completely  new.  Significant  priming  was  observed  for  the  different 
shaped  examples  in  the  second  block,  but  not  for  completely  new  objects  regardless  of  which 
group  the  subject  was  in.  Further,  reading  the  names  of  objects  did  not  reduce  response  times  or 
response time variability although it did reduce the number of synonymous name variants subjects
used. 

Reference 

Cooper,  E.  E.,  &  Biederman,  I.  The  Effects  of  Prior  Name  Familiarization  on  Object  Naming 
Latencies.  Unpublished  manuscript. 

II.  Neural  Net  Theory 

A. A Neural Net Implementation of RBC that, More Generally, Offers a
Solution to the Binding Problem.

Upon  exposure  to  a  single  view  of  an  object,  the  human  can  readily  recognize  that  object  from  any 
other  view  that  preserves  the  parts  in  the  original  view.  Experimental  evidence  suggests  that  this 
fundamental  capacity  reflects  the  activation  of  a  viewpoint  invariant  structural  description 
specifying  the  object's  parts  and  the  relations  among  them.  This  paper  presents  a  neural  network 
model of the process whereby a structural description is generated from a line drawing of an object
and used for object classification. The model's capacity for structural description derives from its
solution to the dynamic binding problem of neural networks: independent units representing an
object's parts (in terms of their shape attributes and interrelations) are bound temporarily when
those attributes occur in conjunction in the system's input. Temporary conjunctions of attributes are
represented by synchronized (or phase locked) oscillatory activity among the units representing
those  attributes.  Specifically,  the  model  uses  phase  locking  to:  a)  parse  images  into  their 
constituent  parts;  b)  bind  together  the  attributes  of  a  part;  and  c)  determine  the  relations  among  the 
parts  and  bind  them  to  the  parts  to  which  they  apply.  Because  it  conjoins  independent  units 
temporarily,  dynamic  binding  allows  tremendous  economy  of  representation,  and  permits  the 
representation  to  reflect  the  attribute  structure  of  the  shapes  represented.  The  model's  recognition 
performance is shown to conform well to empirical findings.
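
The idea of binding by phase locking can be illustrated with a toy sketch. This is not the model
described above: the attribute labels, the phase values, and the tolerance are all assumptions chosen
for illustration, and the grouping rule (units whose phases fall within a small tolerance count as
bound) is a deliberate simplification of oscillatory synchrony.

```python
import math


def phase_distance(a, b):
    """Smallest absolute angular difference between two phases (radians)."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def bind_by_phase(units, tol=0.2):
    """Group attribute units whose firing phases lie within `tol` radians.

    `units` maps a hypothetical attribute label to its current phase.
    Each returned group stands for one temporarily bound part.
    """
    groups = []  # list of (reference_phase, [labels])
    for label, phase in sorted(units.items(), key=lambda kv: kv[1]):
        for ref, members in groups:
            if phase_distance(ref, phase) <= tol:
                members.append(label)
                break
        else:
            # No existing group is in phase with this unit: start a new one.
            groups.append((phase, [label]))
    return [sorted(members) for _, members in groups]


# Hypothetical input: two parts of one object. The units are independent;
# only their shared phase says which attributes belong together.
units = {
    "cone": 0.00, "vertical": 0.05, "above": 0.10,
    "brick": 3.10, "horizontal": 3.15, "below": 3.20,
}
parts = bind_by_phase(units)
print(parts)
```

Because the grouping is carried by phase alone, the same six units could describe a different object
simply by firing in a different phase pattern, which is the economy of representation the abstract
refers to.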

Reference 

Hummel,  J.  E.,  &  Biederman,  I.  (1992).  Dynamic  binding  in  a  neural  network  for  shape 
recognition.  Psychological  Review.  In  press. 

III.  Cortical  Basis  of  Object  Recognition 

A.  Object  Recognition  without  a  Temporal  Lobe 

Is  the  temporal  lobe  required  for  high  level  object  recognition?  Individuals  with  one  temporal  lobe 
removed  (because  of  seizures)  viewed  briefly-presented  line  drawings  of  objects.  The  images 
were  presented  to  the  left  or  right  of  fixation,  so  that  they  would  be  initially  projected  to  the 
contralateral  hemisphere,  and  above  or  below  the  horizon.  The  latter  feature  of  the  presentation 
conditions eliminated transfer from V4 to the contralateral temporal lobe through the corpus
callosum.  Shape  information  should  thus  have  remained  localized  to  the  hemisphere  contralateral 
to  the  visual  field  in  which  the  image  was  shown  until  the  temporal  lobe,  where  rich  callosal 
connections  allow  transfer  to  the  other  temporal  lobe  in  a  normal  individual.  Two  kinds  of  tasks 
were employed: a) naming (and priming), and b) same-different shape judgments to a sequentially
presented pair of pictures, with an intervening mask. In this same-different task, a "same" pair
could be identical or rotated up to 60° in depth. "Different" trials used different exemplars with the
same  name  (e.g.,  two  different  kinds  of  chairs).  In  another  same-different,  depth-rotation  task, 
nonsense  objects  composed  of  simple  volumes  were  used.  If  processes  resident  in  the  temporal 
lobe  are  critical  for  the  high  level  object  recognition  demanded  by  these  tasks,  performance  should 
have been much worse when images were shown in the visual field contralateral to the hemisphere
with the missing temporal lobe. Other than a higher error rate for naming images presented to a left
hemisphere missing its temporal lobe, differences in performance in recognizing objects presented
to a hemisphere with an intact versus absent temporal lobe were minor.

Reference 

Biederman,  I.,  Gerhardstein,  P.  C.,  Cooper,  E.  E.,  &  Nelson,  C.  A.  High  level  object 
recognition  without  a  temporal  lobe.  Abstract  submitted  for  presentation  at  ARVO,  1992. 

B.  Object  Recognition  and  Laterality:  Null  Effects 

In  two  experiments,  normal  subjects  named  briefly  presented  pictures  of  objects  that  were  shown 
either to the left or to the right of fixation. The net effects attributable to hemifield were negligible:
naming RTs were 12 msec lower for pictures shown in the left visual field but error rates were
slightly lower, by 0.8%, for pictures shown in the right visual field. In both experiments, a
second block of trials was run to assess whether hemifield effects would be revealed in object
priming. Naming RTs to same name-different shaped exemplar pictures were significantly longer
than RTs for identical pictures, thus establishing that a component of the priming was visual, rather
than only verbal or conceptual, but hemifield effects on priming were absent. Allowing for the
(unlikely) possibility that variables with large differential left-right hemifield effects may be
balancing and cancelling each other out, we conclude that there are no differential hemifield effects
in either object recognition or object priming.

Reference 

Biederman,  I.,  &  Cooper,  E.  E.  (1991).  Object  recognition  and  laterality:  Null  Results. 
Neuropsychologia,  29,  685-694. 


C.  Unexceptional  Spatial  Memory  in  an  Exceptional  Memorist 

Rajan  Mahadevan  evidences  an  exceptional  memory  for  arrays  of  digits.  We  tested  whether 
Rajan's  spatial  memory  was  likewise  exceptional.  8  control  Ss  and  Rajan  were  instructed  to 
remember  the  position  and  orientation  of  48  images  of  common  objects  shown  either  to  the  left  or 
the right of fixation and facing either left or right. Rajan's accuracy for judging whether the
position  and  orientation  of  these  pictures  had  changed  when  they  were  shown  in  a  different 
sequence  was  lower  than  that  of  control  subjects  for  both  judgments.  Rajan's  exceptional 
memory  capacity  apparently  does  not  extend  to  spatial  relations. 

Reference

Biederman, I., Cooper, E. E., Fox, P. W., & Mahadevan, R. S. (1992). Unexceptional spatial
memory  in  an  exceptional  memorist.  Journal  of  Experimental  Psychology:  Learning,  Memory, 
and  Cognition,  in  press. 

D.  Lack  of  Attentional  Costs  in  Detecting  Visual  Transients. 

Both spotlight and zoom-lens metaphors of attention predict that performance should improve at an
attended  position,  and  that  this  advantage  should  decrease  as  the  area  attended  increases.  These 
assumptions  were  tested  in  a  simple  detection  task  (presence  or  absence  of  an  X)  and  a  simple 
judgment  task  (discriminating  a  dim  from  a  bright  X)  in  different  blocks  of  trials.  The  target  could 
appear  at:  a)  one  of  two  positions,  three  degrees  to  the  left  or  right  of  fixation,  b)  one  of  two 
positions,  six  degrees  to  the  left  or  right  of  fixation,  or  c)  one  of  four  positions,  three  or  six 
degrees  to  the  left  or  right  of  fixation.  Although  the  experiment  was  sufficiently  sensitive  to  detect  a 
change  in  performance  due  to  a  modest  variation  in  target  luminance,  no  effect  of  either  eccentricity 
or  number  of  possible  display  positions,  on  either  detection  or  discrimination,  was  found.  The 
lack  of  an  effect  of  the  number  of  possible  display  positions  seemed  paradoxical  given  previous 
research  (e.g.,  Posner,  Nissen,  &  Ogden,  1978)  showing  a  benefit  of  cueing  a  position.  A  third 
experiment  compared  performance  on  the  two-location  detection  task  to  performance  on  the  same 
task  when  subjects  were  cued  as  to  which  of  the  positions  was  three  times  more  likely  to  have  a 
target than the other. Reaction times to targets at the 75% probable position, though shorter than
those  in  the  25%  probable  condition,  were  significantly  greater  than  in  a  50%  probable  condition, 
where  subjects  received  no  position  cue.  The  results  suggest  that  detection  of  targets  in  the 
periphery  can  occur  in  parallel,  without  an  increase  in  reaction  time  as  the  number  and  area  of 
possible target locations doubles. Further, the overhead associated with the allocation of attention,
within the conditions of these experiments, was greater than any hypothesized benefit from
knowledge  of  target  locations. 
Proceeds in Parallel with a Net Cost from Cueing. Submitted for publication.